Submission to metric track

“Football is two things. It’s blocking and tackling.”

Vince Lombardi

Introduction

Although this quotation from one of the most successful head coaches in NFL history dates back several decades, and American football has of course changed since then, tackling remains an integral aspect of the game. In contrast to the decision-making scenarios players face, such as a quarterback’s choice of a pass target, the decision to tackle is more straightforward: a defensive player should always tackle the ball carrier promptly.

When assessing players’ tackles, one is usually interested in a hypothetical scenario: the potential outcome if a player were to miss a tackle. Essentially, this involves quantifying the yards saved by a defensive player. Ideally, albeit impractically, running a play twice (once with the defensive player executing the tackle and once without) would allow a direct comparison of the yardage gained by the ball carrier and thus an evaluation of the impact of the defensive player’s tackle.

Since such an experiment is impossible, we approximate it by predicting the end-of-play yard line twice: first with the closest defender, who executed the tackle, included, and then with this player excluded. However, quantifying only the yards saved by a particular tackle is not an adequate measure of tackle value, because yards are not interpretable on a scale that is truly relevant to the game outcome. Therefore, we aim to produce a measure of tackle value on the scale of expected points (EP). EP can be viewed as a complicated mapping of the end-of-play yard line to the expected points in the next play. A point prediction of the mean yard line alone does not propagate uncertainty to the EP scale, so we produce a full conditional density estimate and calculate the expected points from it. The metric derived from this methodology quantifies the prevented expected points (PEP).

Data

To accurately predict the yard line at the end of any given play, we derive several features from the tracking data. Specifically, we carried out the following preprocessing steps:

Change of coordinate system

We transformed the coordinate system by

  • redefining the x-variable as the x-distance to the endzone (such that all play directions are from right to left and the relevant endzone is at zero),
  • centering the y-variable such that the center of the field is at zero,
  • changing the direction variable, such that zero degrees represents heading straight towards the relevant endzone.
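The three transformations can be sketched in a few lines of Python. This is a minimal sketch, not the code used for the submission. It assumes a raw tracking convention with a 120 by 53.3 yard field, goal lines at x = 10 and x = 110, a play direction label of "left" or "right" (the side the offense is moving toward), and raw direction angles measured clockwise from the positive y-axis. The function name `to_endzone_frame` and the constants are hypothetical.

```python
FIELD_WIDTH = 53.3       # assumed sideline-to-sideline width in yards
LEFT_GOAL_LINE = 10.0    # assumed raw x of the goal line on the left
RIGHT_GOAL_LINE = 110.0  # assumed raw x of the goal line on the right


def to_endzone_frame(x, y, direction, play_direction):
    """Map one raw tracking row to the transformed coordinate system.

    Returns (x_to_endzone, y_centered, direction_to_endzone), where
    x_to_endzone is 0 at the relevant goal line, y_centered is 0 at the
    middle of the field, and a direction of 0 degrees means heading
    straight towards the relevant endzone.
    """
    half_width = FIELD_WIDTH / 2
    if play_direction == "left":
        # Offense already moves towards smaller x: only shift the origin.
        return x - LEFT_GOAL_LINE, y - half_width, (direction - 270.0) % 360.0
    if play_direction == "right":
        # Rotate the field by 180 degrees so the play also runs right to left.
        return RIGHT_GOAL_LINE - x, half_width - y, (direction - 90.0) % 360.0
    raise ValueError(f"unknown play direction: {play_direction!r}")
```

Under these assumptions, plays in both directions end up in one common frame, so the model never has to learn the mirror image of a situation separately.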

Response variable: Yards to be gained

For each play, we define the x-position of the ball carrier in the last frame as the end-of-play yard line. The response variable we aim to predict is the yards to be gained, defined as the difference between the ball carrier’s x-position in a given frame and the end-of-play yard line.
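As a small worked example of this definition, the sketch below computes the response for every frame of a single play. It assumes the ball carrier’s x-positions are already expressed as distance to the endzone and are given in frame order; the function name `yards_to_be_gained` is hypothetical.

```python
def yards_to_be_gained(ball_carrier_x):
    """Return the response per frame for one play.

    ball_carrier_x: x-distances to the endzone of the ball carrier,
    one value per frame, in frame order (assumed input layout).
    """
    if not ball_carrier_x:
        raise ValueError("a play needs at least one frame")
    end_of_play_yard_line = ball_carrier_x[-1]  # x-position in the last frame
    return [x - end_of_play_yard_line for x in ball_carrier_x]


# A carrier who starts 35 yards from the endzone and is stopped at the 20.
response = yards_to_be_gained([35.0, 30.0, 22.0, 20.0])
```

By construction, the response is zero in the last frame and positive whenever the carrier still advances towards the endzone.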

Feature engineering

For all players and the ball carrier we use the features already contained in the tracking data, namely x- and y-coordinates, speed, acceleration, distance covered, orientation and direction.

For all players except the ball carrier we further compute the

  • Euclidean distance to the ball carrier,
  • x-distance to the ball carrier,
  • y-distance to the ball carrier.

For defensive players only, we additionally compute the absolute difference between the defender’s direction and the angle of the straight line from the defender to the ball carrier.

Subsequently, we order all players (in each frame) by their Euclidean distance to the ball carrier and standardize all features.
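The relative features and the ordering can be sketched for a single frame as follows. This is an illustration under stated assumptions rather than the original preprocessing code: players are plain dictionaries with hypothetical keys, the x- and y-distances are taken as signed differences, a direction of 0 degrees points towards decreasing x and 90 degrees towards increasing y, and the angular difference is wrapped to the range from 0 to 180 degrees. Standardization is omitted for brevity.

```python
import math


def bearing(from_xy, to_xy):
    """Angle in degrees of the segment from one point to another.

    Assumed convention: 0 points towards decreasing x (the endzone),
    90 towards increasing y.
    """
    dx = to_xy[0] - from_xy[0]
    dy = to_xy[1] - from_xy[1]
    return math.degrees(math.atan2(dy, -dx)) % 360.0


def angular_gap(a_deg, b_deg):
    """Absolute difference between two angles, wrapped to [0, 180]."""
    diff = abs(a_deg - b_deg) % 360.0
    return min(diff, 360.0 - diff)


def player_features(player, carrier):
    """Relative features of one player with respect to the ball carrier."""
    dx = player["x"] - carrier["x"]
    dy = player["y"] - carrier["y"]
    feats = {"id": player["id"], "dist": math.hypot(dx, dy),
             "x_dist": dx, "y_dist": dy}
    if player["defense"]:
        # Defenders only: how far the heading deviates from the carrier.
        to_carrier = bearing((player["x"], player["y"]),
                             (carrier["x"], carrier["y"]))
        feats["dir_gap"] = angular_gap(player["dir"], to_carrier)
    return feats


def order_by_distance(players, carrier):
    """Feature dictionaries of all players, closest to the carrier first."""
    feats = [player_features(p, carrier) for p in players]
    return sorted(feats, key=lambda f: f["dist"])
```

Ordering by distance gives every model input slot a fixed meaning (closest player, second closest player, and so on), independent of jersey numbers or roster positions.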

To identify tackle events, we do not rely on the event column. Instead, we define the tackle event as the frame in which the distance between the tackler (whom we identify from the tackle event data set) and the ball carrier is minimal within a given play.
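This rule amounts to an argmin over the frames of a play. The sketch below assumes that the positions of the tackler and the ball carrier are available as dictionaries mapping a frame id to an (x, y) pair; this layout and the function name `tackle_frame` are hypothetical.

```python
import math


def tackle_frame(tackler_track, carrier_track):
    """Return the frame id at which tackler and ball carrier are closest.

    Both arguments map frame id -> (x, y). Only frames present in both
    tracks are considered; ties are resolved in favour of the earliest frame.
    """
    shared_frames = sorted(set(tackler_track) & set(carrier_track))
    if not shared_frames:
        raise ValueError("tracks share no frames")
    return min(
        shared_frames,
        key=lambda f: math.dist(tackler_track[f], carrier_track[f]),
    )
```

Resolving ties in favour of the earliest frame is a design choice of this sketch, not something the text specifies.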

Analysis

Our analysis comprises four steps:

1. Model Training

We train a model that predicts the yards to be gained, from which we can calculate the end-of-play yard line (see Yurko et al., 2020). The model uses the previously described features, restricted to the ten closest defenders, and should account for potential non-linear and interaction effects. The time-series nature of the data suggests deep learning architectures such as transformers or LSTMs. However, we aim to go beyond point estimates of the end-of-play yard line: we set up a conditional density estimator \(\hat{f}(y \mid x)\), which allows for adequate uncertainty propagation in the following steps. We therefore opt for a middle ground between accuracy of the mean prediction and uncertainty quantification and use a random forest comprising 1000 individual trees. Modeling the uncertainty is especially important in our use case, because the variance of the end-of-play yard line differs substantially between game situations.

2. Tackler Replacement Procedure

For each tackle, we remove the closest defender at the moment of the tackle and replace his features with those of the second closest defender. Likewise, we replace the features of the second closest defender with those of the third closest, and so on. In this way, we obtain a prediction for the hypothetical “what if the tackle had been missed” scenario, which can then be compared to the actual tackle.
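The replacement procedure is essentially a shift by one position in the distance-ordered list of defenders. The sketch below makes one assumption that the text does not state explicitly: since the model uses ten defender slots, the last slot in the hypothetical scenario is filled by the eleventh closest defender. The function name `defender_slots` is hypothetical.

```python
def defender_slots(defenders_by_distance, n_slots=10, drop_closest=False):
    """Select the defenders that fill the model's defender slots.

    defenders_by_distance: per-defender feature records, closest first.
    drop_closest=False gives the original features; drop_closest=True
    gives the hypothetical scenario in which the tackler is removed and
    every remaining defender moves up by one slot.
    """
    start = 1 if drop_closest else 0
    chosen = defenders_by_distance[start:start + n_slots]
    if len(chosen) < n_slots:
        # Assumption of this sketch: refuse to predict with empty slots.
        raise ValueError("not enough defenders to fill all slots")
    return chosen


# Eleven defenders, labelled by their distance rank for illustration.
defenders = [f"D{rank}" for rank in range(1, 12)]
original = defender_slots(defenders)                           # D1 .. D10
without_tackler = defender_slots(defenders, drop_closest=True)  # D2 .. D11
```

Both slot assignments are then passed through the same trained model, so any difference in the predictions is attributable to the removed tackler.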

3. Yardline Prediction

Using the trained random forest, we obtain one prediction of the end-of-play yard line from each of the 1000 trees. Smoothing these predictions with a kernel density estimator for visualization, we can plot the dynamically evolving conditional density estimate within any given play.

For illustration, we present a specific example play. The video below shows a successful passing play by the Detroit Lions against the Miami Dolphins. After a completed pass, the receiver (in this case tight end TJ Hockenson) gains substantial yardage by evading a tackle and is finally stopped only 12 yards short of the endzone.

Below we display an animation of the same play (in the transformed coordinate system). In each frame, we add the conditional density of the yards to be gained estimated by our model. A few observations stand out. First, at the beginning of the play the density is rather narrow, because the model expects a tackle by the closest defender. As soon as TJ Hockenson evades the first tackle, the density changes: the variance of the yards-to-be-gained distribution increases, and we even observe a bimodal distribution with a lot of mass at the endzone. Finally, at the time of the tackle the distribution becomes quite narrow again, as we expect the runner to gain only a few more yards.

For the tackle frames in particular, we obtain one predictive density, based on the original features, and one based on the replacement procedure explained above.

4. Tackle Evaluation

From a mathematical perspective, we want to obtain the mean expected points, given the conditional distribution of the end-of-play yard line produced by our random forest. More formally, letting the mapping \(g\) represent the calculation of expected points based on the end-of-play yard line \(y\) we are interested in

\[ \text{E}(g(Y) \mid x) = \int_{0}^{100} g(y) \: \hat{f} (y \mid x) \: dy. \]

For the mapping \(g\) we set up our own EP model based on XGBoost (Chen and Guestrin, 2016), which, once trained, can be used to calculate the expected points. In essence, we follow the implementation of the EP model in the nflfastR package (Carl and Baldwin, 2023). However, we have to derive all model features solely from the predicted yards to be gained of each play, which makes using the nflfastR model directly impractical. As features for our model we use the yard line of the play (adjusted LOS), yards to go, down, quarter, a home team indicator and the timeouts remaining for each team. We omit features such as the seconds remaining in the half, which cannot be derived from the predicted yards to be gained.

Instead of a two-step procedure that first obtains a kernel density estimate from the individual tree predictions (introducing subjectivity in the bandwidth choice) and then integrates numerically, we treat the tree predictions \(\hat{y}_1, \dots, \hat{y}_{1000}\) as samples from the conditional density and approximate the above expectation via the Monte Carlo estimate \[ \frac{1}{1000} \sum_{i=1}^{1000} g(\hat{y}_i). \]
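The following sketch illustrates the Monte Carlo estimate and why it differs from plugging in a point prediction. The function `toy_ep` is a purely hypothetical stand-in for the EP mapping \(g\) (it is not the XGBoost model), and the four "tree predictions" are invented to mimic a bimodal situation with mass at the endzone.

```python
def expected_points_mc(tree_predictions, g):
    """Monte Carlo estimate of E(g(Y) | x) from per-tree predictions.

    tree_predictions: predicted end-of-play yard lines, one per tree.
    g: mapping from an end-of-play yard line to expected points.
    """
    if not tree_predictions:
        raise ValueError("need at least one tree prediction")
    return sum(g(y) for y in tree_predictions) / len(tree_predictions)


def toy_ep(yard_line):
    """Hypothetical EP mapping: 7 points at the endzone, otherwise a
    value that decreases linearly with the distance to the endzone."""
    clipped = min(max(yard_line, 0.0), 100.0)
    if clipped == 0.0:
        return 7.0
    return 6.0 * (1.0 - clipped / 100.0)


# Invented bimodal predictions: half the trees see a touchdown,
# half see a stop at the 20-yard line.
predictions = [0.0, 0.0, 20.0, 20.0]
mc_estimate = expected_points_mc(predictions, toy_ep)           # about 5.9
plug_in = toy_ep(sum(predictions) / len(predictions))           # about 5.4
```

Because \(g\) is non-linear, evaluating it at the mean yard line underestimates the expected points in this toy example, which is the kind of error the full conditional density is meant to avoid.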

A metric for quantifying tackle value can now be obtained in two ways. First, the hypothetical mean expected points can be compared to the mean expected points based on the predicted conditional density using the original features, i.e. \[ \text{PEP}_1 = \text{E}(g(Y) \mid x_{removed}) - \text{E}(g(Y) \mid x_0), \] where \(x_{removed}\) denotes the transformed features after removing the closest defender and \(x_0\) denotes the original features. Second, the latter term can be replaced by the expected points based on the actually observed end-of-play yard line \(y_0\), giving \[ \text{PEP}_2 = \text{E}(g(Y) \mid x_{removed}) - g(y_0). \]
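Both variants can be written directly in terms of the per-tree predictions. The sketch below is an illustration with hypothetical function names; it takes the EP mapping \(g\) as an argument and approximates each conditional expectation by the mean of \(g\) over the tree predictions.

```python
def mean_ep(samples, g):
    """Mean expected points over a set of predicted end-of-play yard lines."""
    if not samples:
        raise ValueError("need at least one sample")
    return sum(g(y) for y in samples) / len(samples)


def pep_1(removed_samples, original_samples, g):
    """Hypothetical scenario versus model prediction with the tackler."""
    return mean_ep(removed_samples, g) - mean_ep(original_samples, g)


def pep_2(removed_samples, observed_yard_line, g):
    """Hypothetical scenario versus the observed end of the play."""
    return mean_ep(removed_samples, g) - g(observed_yard_line)
```

With expected points measured from the offense’s perspective, a positive value means that the hypothetical scenario without the tackler is worth more to the offense, i.e. the tackle prevented expected points.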

\(\text{PEP}_1\) is similar to what the literature calls a treatment effect: it represents the average expected points prevented by the tackle, given the specific game situation. In contrast, \(\text{PEP}_2\) quantifies the expected points prevented by the actually observed tackle. This is relevant, for example, for player evaluation, as players might over- or underperform.

The figure below summarizes the steps above.

Graphical overview of the steps of our analysis.

Player evaluation

In this section we evaluate the PEP values at the player level for the test data. In particular, we calculate each player’s cumulative \(\text{PEP}_1\) and \(\text{PEP}_2\) and display these together with his average \(\text{PEP}_1\) and \(\text{PEP}_2\) per tackle. To obtain reasonable averages, we set the minimum number of tackles to ten. Users are invited to compare players by selecting different teams and clicking on different positions.
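The player-level aggregation can be sketched as follows. The input layout (one tuple of player, \(\text{PEP}_1\) and \(\text{PEP}_2\) per tackle) and the function name `summarize_players` are hypothetical, and the sketch assumes that players below the tackle minimum are dropped from the summary entirely.

```python
from collections import defaultdict


def summarize_players(tackles, min_tackles=10):
    """Cumulative and average PEP per player.

    tackles: iterable of (player, pep_1, pep_2), one entry per tackle.
    Players with fewer than min_tackles tackles are left out.
    """
    totals = defaultdict(lambda: [0, 0.0, 0.0])  # count, sum PEP_1, sum PEP_2
    for player, pep_1, pep_2 in tackles:
        entry = totals[player]
        entry[0] += 1
        entry[1] += pep_1
        entry[2] += pep_2

    summary = {}
    for player, (count, sum_1, sum_2) in totals.items():
        if count < min_tackles:
            continue
        summary[player] = {
            "tackles": count,
            "cum_pep_1": sum_1,
            "cum_pep_2": sum_2,
            "avg_pep_1": sum_1 / count,
            "avg_pep_2": sum_2 / count,
        }
    return summary
```

Cumulative values reward players who tackle often, whereas per-tackle averages describe the typical value of a single tackle, which is why both are reported.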

When comparing players across positions in the above table, it quickly becomes apparent that, for example, defensive tackles do not have PEP values similar to those of safeties. We therefore compare the cumulative \(\text{PEP}_2\) across positions. The lower left graph shows that some positions, as already indicated above, turn out to be low-PEP positions. This is mainly because defensive ends, nose tackles and defensive tackles are backed up by linebackers, safeties and cornerbacks. Inside linebackers, on the other hand, have the highest cumulative PEP values, because players at this position make the most tackles. Comparing the average PEP per position in the bottom right graph, we see that inside linebackers drop in the ranking and are overtaken by cornerbacks and safeties. This can be explained by the fact that cornerbacks and safeties are often the last players able to make the tackle before the ball carrier gains many yards or even scores a touchdown.

Discussion

In this contribution, we developed the metric PEP for quantifying the value of tackles. It allows practitioners to assess players, particularly in terms of their tackling abilities. However, there are some drawbacks that need to be taken into account during the evaluation process. One important consideration is the defensive playing style of teams. We have already seen that cornerbacks and safeties are more likely to make “touchdown-saving tackles”, so their scores tend to be higher. If safeties are also involved in riskier defensive strategies (e.g. in a Brian Flores defense) and therefore receive even less support from their teammates in the backfield, their values could increase even more. Another area for future research is defensive turnovers. We excluded these plays from our analysis because their specific outcome was not compatible with our current EP model. In general, however, tackles that lead to defensive turnovers are positive plays, so players who frequently force fumbles when tackling (such as Haason Reddick) should be rewarded with additional positive value.
In summary, the metrics we have developed can serve as an additional piece of the puzzle in the overall evaluation of (defensive) players, and may gain practical relevance in the process of scouting players and opponents.

Code

All code for data preprocessing, model training, prediction and player evaluation can be found here.

References